After data is in a database, it will likely need to be
accessed, changed, and reported on. To perform these basic
operations, you need to apply the programming constructs of SQL,
specifically Microsoft’s implementation, referred to as Transact-SQL
(T-SQL). Traditional applications can be completely centered on the
four basic SQL commands: SELECT, INSERT, UPDATE, and DELETE.
Essentially, these statements handle every operation that needs to be
performed against the data. The most common of these constructs—the SELECT statement—is the basis for getting data out of the system.
SELECT
statements can be complex and can include the use of options that can
join many tables together and functions that can calculate and
summarize data at the same time. However, a SELECT statement can be as simple as one line of code that retrieves the requested data. The complete SELECT
syntax is involved, with many optional portions. You can find the
complete syntax reference in SQL Server Books Online, under “SELECT,
SELECT (described).” Many of the options are used only under special
circumstances.
Listing the Contents of a Table
You
will often be retrieving all the data from a particular table. Even if
the final query is not intended to get all the data, you can often
begin the data analysis by examining all the rows and columns of data
in a particular table.
SELECT * FROM YUKONTWO.ONE.dbo.Customers
Note that the * is used to obtain all columns from the Customers table. It is also worth noting the use of the four-part name (Server.Database.Owner.Object).
This name includes the server name YUKONTWO, database name ONE, the
owner name dbo, and the name of the table itself, Customers.
Four-part
names are used to perform queries when the one-part name of a table
does not sufficiently qualify the table being queried. If you are
executing a query within the scope of the server itself with the ONE database in use, and you are the owner or the owner is dbo, the four-part name is not necessary. There are therefore several valid variations on queries for the Customers table. Each of the following will produce the same results:
SELECT * FROM Customers
SELECT * FROM dbo.Customers
SELECT * FROM ONE.dbo.Customers
SELECT * FROM ONE..Customers
Although
queries often go after all the data in a table, there are a
considerable number of options available for a query statement. You can
choose some of the columns of a table, provide record-level conditions
to limit the number of rows returned, put the output into groups,
provide group-level conditions to selectively choose the groups, put
the output into sorted order, or produce calculated results. You can
also get into some complex queries through the use of JOIN, UNION, and subquery operations.
Tip
The clauses of a SELECT
query must be provided in the correct order to have valid syntax. As a
mechanism for remembering the order, you can use the following acronym
and phrase: SIFWGHOC (Some Infinitely Funny Winos Get High On
Champagne), for SELECT, INTO, FROM, WHERE, GROUP BY, HAVING, ORDER BY, COMPUTE (BY).
You
do not need to formulate complex queries for the 70-431 exam, so this
chapter covers only the basic theory of the use of the queries, which
is what the exam focuses on.
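As a minimal sketch of the clause order in the mnemonic, the following query uses the Country and City columns of the Customers table from this chapter. The temporary table name #CityCounts and the threshold of 3 are assumptions made for illustration, and COMPUTE is left out.

```sql
-- Clause order: SELECT, INTO, FROM, WHERE, GROUP BY, HAVING, ORDER BY.
-- #CityCounts is a hypothetical temporary table created by INTO.
SELECT Country, COUNT(DISTINCT City) AS CityCount
INTO #CityCounts
FROM Customers
WHERE City IS NOT NULL               -- row-level condition
GROUP BY Country                     -- one group per country
HAVING COUNT(DISTINCT City) >= 3     -- group-level condition
ORDER BY Country
```

Moving any clause out of this sequence (for example, placing WHERE after GROUP BY) produces a syntax error.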
You
can optionally supply column headers to give a user-friendly listing of
the data. By default, the column headers that are displayed in the
result set are the same as the columns specified in the column select
list, such as CUSTNMBR and CNTCPRSN.
Making a Report More Presentable
Why
not change a result set’s column header to something more readable? You
can change the name of a result set column by specifying the keyword AS. (This is the traditional SQL-92 ANSI standard.) Alternatively, you can assign the alias with an equals sign (=)
or simply place the alias after the column name (implied assignment). (Of course,
you would normally use only one of these three techniques, and the
industry standard is SQL-92 ANSI.) The following examples illustrate
column aliasing, first with all three techniques and then with AS throughout:
SELECT CUSTNMBR AS 'Customer Number',
'Customer Name' = CUSTNAME,
CNTCPRSN 'Contact Person'
FROM Customers

SELECT CUSTNMBR AS 'Customer Number',
CUSTNAME AS 'Customer Name',
CNTCPRSN AS 'Contact Person'
FROM Customers
Notice
that the previous column aliases have been enclosed in single quotation
marks. This enclosure is necessary when the column alias includes a
space. An alias that is a reserved SQL Server keyword must also be
delimited, typically by enclosing it within brackets.
Sometimes
you need to combine two columns together to show the two columns as
one. When you do this, you are using a method called string concatenation.
You can think of concatenation as joining strings together, just as you
can combine words into phrases. The operator used to perform the
concatenation is the plus sign (+).
Using TRIM to Remove White Space
To
create
a single combined column, such as a full address built from several
address lines, you concatenate the individual values. If there are
leading or trailing blanks within the data, you
might want to polish the output a little bit by using the functions
LTRIM (left trim) and RTRIM (right trim). These functions remove
leading spaces (LTRIM) or trailing spaces (RTRIM). The resulting code would then look like this:
SELECT LTRIM(RTRIM(Address1))
+ ' ' + LTRIM(RTRIM(Address2))
+ ' ' + LTRIM(RTRIM(Address3)) 'Full Address'
FROM Customers
You can create your own function, which you could name TRIM, to eliminate both left and right spaces. This would be a handy feature to have in the product, and here is how it would look:
CREATE FUNCTION dbo.TRIM(@CHARSTRING NVARCHAR(255))
RETURNS NVARCHAR(255)
AS
BEGIN
RETURN (RTRIM(LTRIM(@CHARSTRING)))
END
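Once such a function exists, it could be called as in the following sketch, which assumes the same Customers table and address columns as the earlier LTRIM/RTRIM query. Note that a user-defined scalar function must be invoked with at least a two-part name, such as dbo.TRIM.

```sql
-- Same result as the nested LTRIM(RTRIM(...)) version, using the
-- user-defined dbo.TRIM function created above.
SELECT dbo.TRIM(Address1)
     + ' ' + dbo.TRIM(Address2)
     + ' ' + dbo.TRIM(Address3) AS 'Full Address'
FROM Customers
```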
Returning TOP Rows
The TOP
clause limits the number of rows returned in a result set to a
specified number or percentage at the top of a sorted range. Here are
two examples:
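A minimal sketch of both forms, assuming the Customers table has the CUSTNMBR and CUSTNAME columns mentioned earlier (the row count and percentage are arbitrary illustrative values):

```sql
-- First 10 customers when sorted by name.
SELECT TOP 10 CUSTNMBR, CUSTNAME
FROM Customers
ORDER BY CUSTNAME

-- First 25 percent of customers when sorted by name.
SELECT TOP 25 PERCENT CUSTNMBR, CUSTNAME
FROM Customers
ORDER BY CUSTNAME
```

Without an ORDER BY clause, TOP still limits the number of rows, but which rows are returned is not predictable.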
As an alternative to TOP, you can also limit the number of rows to return by using SET ROWCOUNT
N.
The difference between this option and TOP is that the TOP keyword
applies only to the single SELECT statement in which it is specified. In
contrast, SET ROWCOUNT stays in effect until another SET ROWCOUNT
statement is executed (for example, SET ROWCOUNT 0 to turn off the option).
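A short sketch of the session-wide behavior, assuming the Customers table and the column names used earlier in this chapter:

```sql
SET ROWCOUNT 10   -- the limit applies to every statement that follows

-- Both queries return at most 10 rows.
SELECT CUSTNMBR, CUSTNAME FROM Customers ORDER BY CUSTNAME
SELECT Country, City FROM Customers

SET ROWCOUNT 0    -- turn the limit off again
```

Forgetting the final SET ROWCOUNT 0 leaves later statements in the same session silently truncated, which is one reason TOP is usually the safer choice for a single query.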
You can optionally specify that the TOP keyword is to use the WITH TIES option, which requires an ORDER BY clause. In this case, more rows than the number requested can be returned. WITH TIES
displays all records whose ORDER BY value matches that of the last qualifying row.
If you are looking for the top 10 employees and two employees tie for
10th, 11 records are displayed (more if additional employees share that
value). If the tie is for 9th or a higher position, only 10 records are listed.
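The tie behavior can be sketched as follows. The Employees table and its EmployeeName and Salary columns are hypothetical names chosen only to match the "top 10 employees" scenario.

```sql
-- Returns the 10 highest-paid employees, plus any further employees
-- whose Salary equals that of the 10th row.
SELECT TOP 10 WITH TIES EmployeeName, Salary
FROM Employees
ORDER BY Salary DESC
```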
Of
course, after you begin placing data in the desired order, you often
need to group the output and perform calculations based on the groups.
As discussed in the
following section, grouping allows the production of subtotals and also
provides more usable output in applications that require grouped output.
Displaying Groups in Output
You can use the GROUP BY clause of the SELECT
statement to create groups within data. You can then use these groups
to display data in a more orderly fashion or produce more meaningful
results through the use of aggregate functions.
The GROUP BY
clause specifies the groups into which output is to be shown and, if
aggregate functions are included, calculations of summary values are
performed for each group. When GROUP BY
is specified, either each column in any non-aggregate expression in the
select list should be included in the GROUP BY list, or the GROUP BY expression must match exactly the select list expression.
Alert
The GROUP BY option of the SELECT
statement is often coupled with a HAVING clause to provide a condition
against all groups. Also, the ORDER BY clause is almost always present
with GROUP BY to ensure that the groups are returned in a predictable order.
It is important to note that if the ORDER BY clause is not specified, groups returned using the GROUP BY clause are not in any particular order. It is recommended that you always use the ORDER BY clause to specify a particular ordering of data. Data will still be collected into groups. See the following example:
SELECT Country, Count(DISTINCT City) AS 'Number of Cities'
FROM Customers GROUP BY Country
In
this example, countries are collected together and are placed in
whatever order SQL Server chooses (often ascending, but this is not
guaranteed). The number of unique
cities is counted and displayed beside the related country. By
supplying the ORDER BY clause, as in
the following example, you sort data into descending sequence, placing
the country with the greatest number of unique cities at the top:
SELECT Country, Count(DISTINCT City) AS 'Number of Cities'
FROM Customers GROUP BY Country
ORDER BY Count(DISTINCT City) DESC
You might not want all groups to be included in the output. To exclude groups from the recordset, you can utilize the HAVING clause, which operates against the groups of data in the same way that the WHERE clause acts against the individual rows. In the example shown in Figure 1, the listing has been narrowed down through the elimination of countries with fewer than three unique cities.
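A query matching that description might look like the following sketch, which builds on the earlier GROUP BY example; the exact statement shown in Figure 1 may differ.

```sql
-- Keep only countries that have three or more unique cities.
SELECT Country, COUNT(DISTINCT City) AS 'Number of Cities'
FROM Customers
GROUP BY Country
HAVING COUNT(DISTINCT City) >= 3
ORDER BY COUNT(DISTINCT City) DESC
```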
The HAVING clause is similar to the WHERE clause. In a SELECT statement, these clauses control the rows from the source tables that are used to build the result set. WHERE and HAVING
are filters: They specify a series of search conditions, and only those
rows that meet the terms of the search conditions are used to build the
result set. To address how these clauses are used, you must understand
the conditions that can be applied within these clauses.